Abstract



Important applications in robotic and sensor networks require dis- 
tributed algorithms to solve the so-called relative localization problem: 
a node-indexed vector has to be reconstructed from measurements of dif- 
ferences between neighbor nodes. In a recent note, we have studied the 
estimation error of a popular gradient descent algorithm showing that 
the mean square error has a minimum at a finite time, after which the 
performance worsens. This paper proposes a suitable modification of this 
algorithm incorporating more realistic a priori information on the posi- 
tion. The new algorithm presents a performance monotonically decreasing 
to the optimal one. Furthermore, we show that the optimal performance 
is approximated, up to a $1 + \varepsilon$ factor, within a time which is independent
of the graph and of the number of nodes. This convergence time is
very much related to the minimum exhibited by the previous algorithm 
and both lead to the following conclusion: in the presence of noisy data, 
cooperation is only useful till a certain limit. 

1 Introduction 

We study in this paper the distributed solution of a problem of relative local- 
ization in a network of sensors. We assume to have a group of agents organized 
in a graph and a vector, indexed over the agents and unknown to them: the 
agents are allowed to take relative noisy measurements of their vector entries 
with respect to their neighbors in the graph. The estimation problem consists 
in reconstructing the original vector, up to an additive constant. We refer to 
this problem as the problem of relative localization. 



Contribution 

In our previous work [1], we studied the performance of a distributed algorithm,
obtained as a gradient descent solution of a least-squares formulation of
the localization problem. The mean square estimation error of this algorithm
has a minimum at a finite time, after which the performance worsens. This
non-monotonic behavior, although very interesting from a theoretical point of
view, may be seen as a potential drawback of the algorithm. For this reason, 
in the present paper we build on the insights gained from our previous work to 
present an algorithm with monotonic mean square error performance. 

As the main contribution of this work, we define an $\varepsilon$-convergence time for
the algorithm and we find an upper bound on it, which has the remarkable
feature of being independent of the network and even of the number of sensors.
Notably, the minimum time of the algorithm in [1] also has an upper bound which
is independent of the graph. Both these observations suggest that cooperation
provides limited benefit in reconstructing estimates from measurements which
are affected by noise. Indeed, a bounded optimal time means that there is no
advantage for a node in obtaining data from outside a certain neighborhood.
Intuitively, communication with sensors which are far away in the network does
not contribute enough significant information: then, the noise which corrupts
the data makes it useless (in the algorithm below) or even misleading (in the
more naive algorithm in [1]).

Related Literature 

The problem of relative localization has been brought to our attention in the 
formulation of [2, 3, 4], which is slightly different from ours, as these authors
assume to have an anchor node, in order to avoid the uncertainty about the 
additive constant. The natural applications of this estimation problem include 
spatial localization and clock synchronization [5, 6, 7]. Distributed algorithms
have been proposed in several papers, including [2, 5, 8], and contemporary
work is focusing on randomized algorithms [9, 10, 11].

Paper organization 

In Section 2 we define the problem of relative localization, and our novel
algorithm for its solution is derived in Section 3. Then in Section 4 we analytically
study the convergence and the mean square error of the algorithm, while
simulations are described in Section 5. We conclude with a short section which
summarizes our contribution and points to future research.

Notation 

Vectors are denoted with boldface letters, and matrices with capital letters. 
By the symbols $\mathbf{1}$ and $\mathbf{0}$ we denote vectors having all entries equal to 1 and
0, respectively. Given a matrix $M$, we denote by $\mathrm{tr}(M)$ its trace, by $M^T$ its
transpose and by $M^\dagger$ its Moore-Penrose pseudo-inverse.

2 The relative localization problem 

We consider a set of $N$ agents, and we endow each of them with a scalar quantity
$x_i \in \mathbb{R}$, for $i \in \{0, \dots, N-1\}$. The $i$th agent does not know the value $x_i$, but
has an estimate $\hat{x}_i \in \mathbb{R}$. We shall denote by $\mathbf{x}$ and $\hat{\mathbf{x}}$ the $N$-dimensional vectors
whose components are $x_i$ and $\hat{x}_i$, respectively. We suppose that each agent $i$
can take relative measurements $x_i - x_j$ with respect to some neighbors $j$. An
undirected graph $G = (\{0, \dots, N-1\}, E)$ is used to represent the available
measurements. The set of vertices is constituted by the $N$ agents, and the
edges (pairs of agents) in $E$ correspond to the available measurements. We
assume that there are $M$ available measurements, and that measurements are
symmetrical, meaning that both agents of a pair know the measurement, with
a reversed sign. Furthermore, we assume that the graph $G$ is connected. On
each edge, we choose an orientation, that is, we define a starting node and an
ending node, in order to encode the measurements by using the incidence matrix
$A \in \mathbb{R}^{M \times N}$ defined as follows:

$$A_{ei} = \begin{cases} 1 & \text{if } i \text{ is the ending node of } e \\ -1 & \text{if } i \text{ is the starting node of } e \\ 0 & \text{otherwise.} \end{cases}$$

Measurements are affected by errors, which can be modeled by independent and
identically distributed noises. Let $\mathbf{b} \in \mathbb{R}^M$ be the vector of the measurements
and $\mathbf{n} \in \mathbb{R}^M$ that of noises. Then, in matrix notation we have

$$\mathbf{b} = A\mathbf{x} + \mathbf{n}$$

with $\mathbb{E}[\mathbf{n}] = \mathbf{0}$ and $\mathbb{E}[\mathbf{n}\mathbf{n}^T] = \sigma^2 I$, where $I \in \mathbb{R}^{M \times M}$ is the identity matrix.
It is also useful to define the Laplacian of $G$ as $L = A^T A$. The Laplacian $L$
is a symmetric matrix and, $G$ being connected, $L$ has eigenvalues $\lambda_0 = 0$ and
$0 < \lambda_i \le 2 d_{\max}$ for $i \in \{1, \dots, N-1\}$, with $d_{\max}$ denoting the maximum degree
of the nodes.
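As an illustration of the measurement model, the following minimal sketch builds the incidence matrix and the Laplacian of a cycle graph and draws noisy relative measurements. The helper names, the choice of a cycle, the edge orientation (edge e from node e to node e+1) and the Gaussian noise are assumptions made here for the example; the text only requires zero-mean i.i.d. noise with variance $\sigma^2$.

```python
import random

def cycle_incidence(num_nodes):
    """Incidence matrix A (M x N, with M = N) of a cycle graph.

    Edge e is oriented from node e (starting node, entry -1)
    to node (e + 1) % num_nodes (ending node, entry +1).
    """
    A = [[0] * num_nodes for _ in range(num_nodes)]
    for e in range(num_nodes):
        A[e][e] = -1
        A[e][(e + 1) % num_nodes] = 1
    return A

def laplacian(A):
    """Laplacian L = A^T A, computed entry by entry."""
    num_edges, num_nodes = len(A), len(A[0])
    return [[sum(A[e][i] * A[e][j] for e in range(num_edges))
             for j in range(num_nodes)] for i in range(num_nodes)]

def noisy_measurements(A, x, sigma, rng):
    """b = A x + n, with n drawn i.i.d. Gaussian (an assumption), std sigma."""
    return [sum(a * xi for a, xi in zip(row, x)) + rng.gauss(0.0, sigma)
            for row in A]

rng = random.Random(0)
N = 6
A = cycle_incidence(N)
L = laplacian(A)
x_true = [rng.gauss(0.0, 1.0) for _ in range(N)]
b = noisy_measurements(A, x_true, sigma=0.1, rng=rng)
```

Each row of `A` sums to zero, which is the matrix form of the fact that the measurements only determine the vector up to an additive constant.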



3 Definition of the algorithm 

In view of the statistical assumptions on the noise affecting the measurements, a
natural approach to the relative localization problem involves solving the
least-squares problem

$$\min_{\mathbf{x}} \Psi(\mathbf{x}), \qquad \Psi(\mathbf{x}) := \|A\mathbf{x} - \mathbf{b}\|_2^2.$$

This approach has already been taken in the literature, and leads to the design of
the distributed algorithm studied in [1]. In this paper, we additionally
assume that each node has a priori information on $\mathbf{x}$, which is known
to be a random vector independent from $\mathbf{n}$ and such that $\mathbb{E}[\mathbf{x}] = \mathbf{x}_0$ and
$\mathbb{E}[(\mathbf{x} - \mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)^T] = \nu^2 I$. In order to exploit this statistical information,
we choose to minimize the functional

$$\Phi(\mathbf{x}) = \frac{1}{\sigma^2}\|A\mathbf{x} - \mathbf{b}\|_2^2 + \frac{1}{\nu^2}\|\mathbf{x} - \mathbf{x}_0\|_2^2,$$

which includes both the information obtained by the measurements and the
a priori knowledge about $\mathbf{x}$, weighted according to their significance, i.e., the
inverse of their variances. Compared to $\Psi(\mathbf{x})$, the extra term in this functional
can be seen as a Tikhonov regularization term, which turns the estimation
problem at hand into a problem of maximum a posteriori probability (MAP)
estimation. We refer the reader to [12, §6.3.2 and §7.1.2] for a broad introduction
to these concepts.

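As $\Phi$ is convex, it is natural to consider gradient descent algorithms for its
minimization. Provided we define $\gamma = \sigma^2/\nu^2$, the gradient of the objective function
is $\nabla\Phi(\mathbf{x}) = \frac{2}{\sigma^2}\left(A^T A\mathbf{x} - A^T\mathbf{b} + \gamma(\mathbf{x} - \mathbf{x}_0)\right)$, so that a gradient descent iterate
can be defined as

$$\begin{aligned}
\mathbf{x}[t+1] &= \mathbf{x}[t] - \tau\frac{\sigma^2}{2}\nabla\Phi(\mathbf{x}[t]) \\
&= \mathbf{x}[t] - \tau\left[(A^T A\mathbf{x}[t] - A^T\mathbf{b}) + \gamma(\mathbf{x}[t] - \mathbf{x}_0)\right] \\
&= (I - \tau L - \tau\gamma I)\mathbf{x}[t] + \tau A^T\mathbf{b} + \tau\gamma\mathbf{x}_0
\end{aligned}$$

for a suitable $\tau > 0$. Equivalently, we may write the algorithm as

$$\begin{cases} \mathbf{x}[t+1] = Q\mathbf{x}[t] + \mathbf{w} \\ \mathbf{x}[0] = \mathbf{x}_0 \end{cases} \tag{1}$$

where

$$Q = I - \tau L - \tau\gamma I = (1 - \tau\gamma)I - \tau L \tag{2}$$

and

$$\mathbf{w} = \tau A^T\mathbf{b} + \tau\gamma\mathbf{x}_0.$$

Remarkably, this algorithm is distributed, in the following sense. The matrix
$Q$ is adapted to the graph $G$, i.e., for $i \neq j$, $Q_{ij} = 0$ if $(i,j) \notin E$: then, in order to update a
component as $x_i[t+1] = \sum_j Q_{ij} x_j[t] + w_i$, the algorithm requires communication
and measurements only with the nodes which are neighbors of $i$ in the graph.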
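To make the locality of the update concrete, here is a minimal sketch of one synchronous round on a cycle graph, written from the point of view of each node. The function name `local_step`, the cycle topology, the edge orientation (edge i from node i to node i+1) and the numerical values are assumptions chosen for the example.

```python
def local_step(x, b, x0, tau, gamma):
    """One synchronous round of x[t+1] = Q x[t] + w on a cycle graph.

    Edge i is assumed to go from node i to node (i + 1) % N, so b[i] is a
    noisy reading of x[i+1] - x[i]. Node i only uses its own state, the
    states of its two neighbors, and the two measurements on its edges.
    """
    N = len(x)
    new_x = []
    for i in range(N):
        left, right = (i - 1) % N, (i + 1) % N
        # w_i = tau * (A^T b)_i + tau * gamma * x0_i
        w_i = tau * (b[left] - b[i]) + tau * gamma * x0[i]
        # Q_ii = 1 - tau*gamma - tau*d_i with d_i = 2; Q_ij = tau for neighbors
        q_ii = 1.0 - tau * gamma - 2.0 * tau
        new_x.append(q_ii * x[i] + tau * (x[left] + x[right]) + w_i)
    return new_x

# Noise-free sanity check: if b = A x exactly and the prior mean equals x,
# then x is a fixed point of the update.
x_true = [0.0, 1.0, 3.0, 2.0, 5.0]
N = len(x_true)
b_exact = [x_true[(i + 1) % N] - x_true[i] for i in range(N)]
gamma = 0.5
tau = 0.3          # satisfies tau < 1 / (d_max + gamma) = 0.4
x_next = local_step(x_true, b_exact, x_true, tau, gamma)
```

In the noise-free check at the end, `x_next` coincides with `x_true` up to rounding, as expected when the measurements and the prior are both exact.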

4 Analysis 

In the analysis of algorithm (1) and from here on in this paper, we shall make
the following assumption, which is sufficient for our results.

Assumption 1. The graph $G$ is connected and

$$\tau < \frac{1}{d_{\max} + \gamma}.$$

4.1 Stability properties 

We begin our analysis by studying the convergence properties of the proposed 
algorithm. 

Proposition 1 (Convergence). If Assumption 1 is satisfied, then the
algorithm (1) converges at exponential rate to

$$\mathbf{x}^* = (A^T A + \gamma I)^{-1}(A^T\mathbf{b} + \gamma\mathbf{x}_0),$$

which is the optimal solution to the problem

$$\min_{\mathbf{x}} \Phi(\mathbf{x}).$$

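Proof. First, we show that $\mathbf{x}^*$ is the optimal solution of the optimization
problem. To this goal, we equate the gradient $\nabla\Phi(\mathbf{x})$ to zero and solve the
normal equation

$$(A^T A + \gamma I)\mathbf{x} = A^T\mathbf{b} + \gamma\mathbf{x}_0.$$

Since $\gamma > 0$, the matrix $A^T A + \gamma I = L + \gamma I$ is invertible, and hence the optimal
solution is unique and equal to $\mathbf{x}^*$.

Second, we show that the algorithm converges to $\mathbf{x}^*$. By solving the
recursion we have

$$\mathbf{x}[t] = Q^t\mathbf{x}_0 + \sum_{n=0}^{t-1} Q^n\mathbf{w}. \tag{3}$$

Since $L$ is symmetric and $Q = (1 - \tau\gamma)I - \tau L$, also $Q$ is diagonalizable with real eigenvalues
$1 - \tau\gamma - \tau\lambda_i$. Using Assumption 1 and the bound $\lambda_i \le 2 d_{\max}$, we have

$$-1 + \tau\gamma < 1 - \tau\gamma - 2\tau d_{\max} \le 1 - \tau\gamma - \tau\lambda_i \le 1 - \tau\gamma.$$

Therefore, given the assumptions on $\tau$, all the eigenvalues of $Q$ belong to the
interval $[-1 + \tau\gamma, 1 - \tau\gamma]$ (note that $\tau\gamma < 1$), the algorithm is exponentially
convergent, and $\lim_{t\to\infty} Q^t = 0$. Then, we can compute

$$\begin{aligned}
\mathbf{x}[t] &= Q^t\mathbf{x}_0 + (I - Q)^{-1}(I - Q)\sum_{n=0}^{t-1} Q^n\mathbf{w} \\
&= Q^t\mathbf{x}_0 + (I - Q)^{-1}(I - Q^t)\mathbf{w}
\end{aligned}$$

and consequently

$$\lim_{t\to\infty}\mathbf{x}[t] = (I - Q)^{-1}\mathbf{w} = (L + \gamma I)^{-1}(A^T\mathbf{b} + \gamma\mathbf{x}_0) = \mathbf{x}^*.$$

□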

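Remark 1 (Average preservation). The algorithm preserves the barycenter (or
average) of the state, namely $\frac{1}{N}\mathbf{1}^T\mathbf{x}[t] = \frac{1}{N}\mathbf{1}^T\mathbf{x}_0$. Remarkably, this property
holds even if $Q$ is not stochastic. Notice indeed that, since $A\mathbf{1} = \mathbf{0}$, we have $\mathbf{1}^T\mathbf{w} = \tau\gamma\mathbf{1}^T\mathbf{x}_0$ and
$\mathbf{1}^T\mathbf{x}[t+1] = (1 - \tau\gamma)\mathbf{1}^T\mathbf{x}[t] + \tau\gamma\mathbf{1}^T\mathbf{x}_0$. Since $\mathbf{x}[0] = \mathbf{x}_0$, by induction the
barycenter is preserved. This property is also consistent with the intuition that
the optimal solution must satisfy $\frac{1}{N}\mathbf{1}^T\mathbf{x}^* = \frac{1}{N}\mathbf{1}^T\mathbf{x}_0$. □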
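The two properties above (convergence to the solution of the normal equation, and preservation of the barycenter) can be checked numerically. The sketch below does so on a small cycle graph; the topology, the parameter values, the Gaussian sampling and the helper names are assumptions made for the example.

```python
import random

def step(x, b, x0, tau, gamma):
    """x[t+1] = Q x[t] + w on a cycle; edge i goes from node i to node i+1."""
    N = len(x)
    return [(1 - tau * gamma - 2 * tau) * x[i]
            + tau * (x[i - 1] + x[(i + 1) % N])
            + tau * (b[i - 1] - b[i]) + tau * gamma * x0[i]
            for i in range(N)]

def normal_equation_residual(x, b, x0, gamma):
    """Largest entry of |(L + gamma I) x - (A^T b + gamma x0)|."""
    N = len(x)
    return max(abs((2 + gamma) * x[i] - x[i - 1] - x[(i + 1) % N]
                   - (b[i - 1] - b[i]) - gamma * x0[i])
               for i in range(N))

rng = random.Random(1)
N, sigma, nu = 8, 0.2, 1.0
gamma = sigma**2 / nu**2
tau = 0.9 / (2 + gamma)                       # Assumption 1 with d_max = 2
x0 = [float(i) for i in range(N)]             # prior mean (arbitrary choice)
x_true = [x0[i] + rng.gauss(0.0, nu) for i in range(N)]
b = [x_true[(i + 1) % N] - x_true[i] + rng.gauss(0.0, sigma) for i in range(N)]

x = list(x0)
means = [sum(x) / N]
for _ in range(3000):
    x = step(x, b, x0, tau, gamma)
    means.append(sum(x) / N)
residual = normal_equation_residual(x, b, x0, gamma)
```

After the loop, `residual` is at the level of rounding error and every entry of `means` equals the initial average, in agreement with Proposition 1 and Remark 1.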

4.2 Transient mean-square performance 

To evaluate the algorithm performance, we follow the approach in [13] and define
the performance metric as the mean square error between the current estimate
$\mathbf{x}[t]$ and the true configuration $\mathbf{x}$, that is,

$$H_t := \frac{1}{N}\mathbb{E}\|\mathbf{x}[t] - \mathbf{x}\|_2^2,$$

where the expectation is taken on both the noise $\mathbf{n}$ and the initial condition $\mathbf{x}_0$.
This performance metric can be computed in terms of the eigenvalues of the
matrix $Q$.



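Proposition 2 (Mean square performance). If Assumption 1 is satisfied, then
the following equality holds

$$H_t = \frac{\nu^2}{N}\sum_{i=0}^{N-1}\xi_i^{2t} + \frac{\tau\sigma^2}{N}\sum_{i=0}^{N-1}\frac{1 - \xi_i^{2t}}{1 - \xi_i},$$

where the $\xi_i$'s are the eigenvalues of $Q$.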
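Proof. We express $\mathbf{w}$ in terms of $\mathbf{n}$ and $\mathbf{x}_0 - \mathbf{x}$ as

$$\begin{aligned}
\mathbf{w} &= \tau A^T(A\mathbf{x} + \mathbf{n}) + \tau\gamma\mathbf{x}_0 \\
&= (I - \tau\gamma I - Q)\mathbf{x} + \tau A^T\mathbf{n} + \tau\gamma\mathbf{x}_0 \\
&= (I - Q)\mathbf{x} + \tau A^T\mathbf{n} + \tau\gamma(\mathbf{x}_0 - \mathbf{x}).
\end{aligned}$$

Now, we compute $\mathbf{x}[t] - \mathbf{x}$, given $\mathbf{w}$ and (3), as

$$\mathbf{x}[t] - \mathbf{x} = Q^t(\mathbf{x}_0 - \mathbf{x}) + \tau\gamma\sum_{n=0}^{t-1} Q^n(\mathbf{x}_0 - \mathbf{x}) + \tau\sum_{n=0}^{t-1} Q^n A^T\mathbf{n}.$$

From the definition of $H_t$ we have

$$H_t = \frac{1}{N}\mathbb{E}\left[\mathrm{tr}\left[(\mathbf{x}[t] - \mathbf{x})(\mathbf{x}[t] - \mathbf{x})^T\right]\right].$$

By using the above formula for $\mathbf{x}[t] - \mathbf{x}$, we get

$$H_t = \frac{1}{N}\mathrm{tr}\left[\nu^2\Big(Q^t + \tau\gamma\sum_{m=0}^{t-1} Q^m\Big)^2 + \tau^2\sigma^2\sum_{n=0}^{t-1}\sum_{m=0}^{t-1} Q^n L Q^m\right]$$

through some algebraic manipulations (which we omit) involving the properties
of the trace operator, the linearity of expectation and the symmetry of $Q$. Now,
given that $\gamma = \sigma^2/\nu^2$, we obtain

$$\begin{aligned}
H_t &= \frac{1}{N}\mathrm{tr}\Big[\nu^2 Q^{2t} + \tau\sigma^2(I + Q^t)\sum_{n=0}^{t-1} Q^n\Big] \\
&= \frac{1}{N}\mathrm{tr}\left[\nu^2 Q^{2t} + \tau\sigma^2(I + Q^t)(I - Q^t)(I - Q)^{-1}\right] \\
&= \frac{1}{N}\mathrm{tr}\left[\nu^2 Q^{2t} + \tau\sigma^2(I - Q^{2t})(I - Q)^{-1}\right].
\end{aligned}$$

Notice that the matrix $(I - Q)$ is invertible since it is proportional to $L + \gamma I$.
The result follows immediately as the $\xi_i$'s are the eigenvalues of $Q$. □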
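The closed form can be evaluated directly once the spectrum of $Q$ is known. The sketch below does this for a cycle graph, whose Laplacian eigenvalues are $2 - 2\cos(2\pi k/N)$, and compares the result with the unsimplified expression (prior-error term plus noise term) from the proof. The function names and the step size `tau = 0.4` are assumptions for illustration.

```python
import math

def cycle_laplacian_eigenvalues(N):
    """Eigenvalues of the cycle-graph Laplacian: 2 - 2 cos(2 pi k / N)."""
    return [2.0 - 2.0 * math.cos(2.0 * math.pi * k / N) for k in range(N)]

def mse_closed_form(t, lambdas, tau, sigma, nu):
    """H_t from Proposition 2, summed over the eigenvalues xi_i of Q."""
    gamma = sigma**2 / nu**2
    total = 0.0
    for lam in lambdas:
        xi = 1.0 - tau * gamma - tau * lam
        total += nu**2 * xi**(2 * t) + tau * sigma**2 * (1 - xi**(2 * t)) / (1 - xi)
    return total / len(lambdas)

def mse_unsimplified(t, lambdas, tau, sigma, nu):
    """Same quantity before simplification: prior-error term plus noise term."""
    gamma = sigma**2 / nu**2
    total = 0.0
    for lam in lambdas:
        xi = 1.0 - tau * gamma - tau * lam
        s = sum(xi**n for n in range(t))          # sum_{n<t} xi^n
        total += nu**2 * (xi**t + tau * gamma * s)**2 + (tau * sigma)**2 * lam * s**2
    return total / len(lambdas)

N, sigma, nu, tau = 160, 1.0, 20.0, 0.4   # tau is a hypothetical step size
lambdas = cycle_laplacian_eigenvalues(N)
H = [mse_closed_form(t, lambdas, tau, sigma, nu) for t in range(50)]
```

At `t = 0` the formula returns $\nu^2$, which is the prior variance per node, as expected since the iteration starts from $\mathbf{x}_0$.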



The key property of monotonicity of Ht is stated in the next result. 



Theorem 3 (Monotonicity of $H_t$). If Assumption 1 is satisfied, then $H_t$ is
strictly decreasing and

$$H_\infty := \lim_{t\to+\infty} H_t = \frac{\tau\sigma^2}{N}\sum_{i=0}^{N-1}\frac{1}{1 - \xi_i}.$$

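Proof. Let us recall the definition of $\gamma$ and define a new constant $\alpha$, as
done in [1]:

$$\alpha := \frac{1}{\tau\gamma} = \frac{\nu^2}{\tau\sigma^2}. \tag{4}$$

Note that, given Assumption 1, $\alpha > 1 + d_{\max}/\gamma$. Keeping this inequality in
mind we can rewrite $H_t$ as

$$H_t = \frac{\tau\sigma^2}{N}\sum_{i=0}^{N-1}\left[\Big(\alpha - \frac{1}{1 - \xi_i}\Big)\xi_i^{2t} + \frac{1}{1 - \xi_i}\right].$$

We will show that $H_t$ is decreasing in $t$, since the $i$-th term in the sum is either
a constant or a decreasing sequence. Let us compute the finite increment

$$\begin{aligned}
H_{t+1} - H_t &= \frac{\tau\sigma^2}{N}\sum_{i=0}^{N-1}\left[\xi_i^{2t}\left(\alpha\xi_i^2 - \alpha + 1 + \xi_i\right)\right] \\
&= \frac{\tau\sigma^2}{N}\sum_{i=0}^{N-1}\xi_i^{2t}\,h(\xi_i)
\end{aligned}$$

with $h(\xi) = \alpha\xi^2 + \xi + 1 - \alpha$. Note that $\xi^{2t} > 0$ for all $\xi \neq 0$, whereas $h(\xi) < 0$
when $\xi \in (-1, 1 - \frac{1}{\alpha}) = (-1, 1 - \tau\gamma)$. Since $\xi_i \in [-1 + \tau\gamma, 1 - \tau\gamma)$ when $i > 0$,
the corresponding contribution in $H_t$ is a decreasing sequence (unless $\xi_i = 0$).

The contribution of $\xi_0 = 1 - \tau\gamma$ in $H_t$ is constant, since $\alpha\xi_0^{2t} + \frac{1 - \xi_0^{2t}}{1 - \xi_0} = \alpha$. This
corresponds to the invariance of the barycenter.

The sequence $H_t$ is bounded and monotonic, so it has a limit that we can
also compute explicitly as

$$\lim_{t\to+\infty} H_t = \frac{\tau\sigma^2}{N}\sum_{i=0}^{N-1}\frac{1}{1 - \xi_i} = \frac{\sigma^2}{N}\sum_{i=0}^{N-1}\frac{1}{\gamma + \lambda_i},$$

where the $\lambda_i$ are the eigenvalues of the Laplacian of the graph. □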

Remark 2 (Meaning of $H_\infty$). It is worth recalling that the asymptotic error,
which we can also write as $H_\infty = \frac{1}{N}\mathbb{E}\|\mathbf{x}^* - \mathbf{x}\|_2^2$, only depends on the properties
of $\mathbf{x}^*$ as the solution of the regularized least-squares problem: hence it does not
depend on the algorithm.
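The sign argument of the monotonicity proof can be inspected numerically. The following sketch evaluates $h(\xi_i)$ and the increments $H_{t+1} - H_t$ on a cycle graph, and computes the limit error in its two equivalent forms. The cycle topology and the values of `N`, `sigma`, `nu` and `tau` are assumptions chosen for the example.

```python
import math

N, sigma, nu, tau = 160, 1.0, 20.0, 0.4      # tau: hypothetical step size
gamma = sigma**2 / nu**2
alpha = 1.0 / (tau * gamma)
lambdas = [2.0 - 2.0 * math.cos(2.0 * math.pi * k / N) for k in range(N)]
xis = [1.0 - tau * gamma - tau * lam for lam in lambdas]   # eigenvalues of Q

def h(xi):
    """Sign-determining factor of the increment H_{t+1} - H_t."""
    return alpha * xi**2 + xi + 1.0 - alpha

def increment(t):
    """H_{t+1} - H_t = (tau sigma^2 / N) * sum_i xi_i^(2t) h(xi_i)."""
    return tau * sigma**2 / N * sum(xi**(2 * t) * h(xi) for xi in xis)

increments = [increment(t) for t in range(200)]

# Limit value in its two equivalent forms.
H_inf_xi = tau * sigma**2 / N * sum(1.0 / (1.0 - xi) for xi in xis)
H_inf_lambda = sigma**2 / N * sum(1.0 / (gamma + lam) for lam in lambdas)
```

Here `h(xis[0])` vanishes up to rounding (the barycenter mode contributes a constant), while `h` is negative on all other eigenvalues, so every entry of `increments` is negative.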



4.3 Near-optimal stopping time 



For every $\varepsilon > 0$, we can define a near-optimal stopping time, after which the
estimation error is at most a factor $(1 + \varepsilon)$ larger than the optimal one:

$$t^*_\varepsilon = \inf\{t : H_t \le (1 + \varepsilon)H_\infty\}.$$

The following estimate shows that the algorithm can be stopped, with a
guaranteed loss of accuracy with respect to the regularized least-squares
optimum, after a time which does not depend on the graph or even on the number of
sensors.

Proposition 4 (Universal bound on stopping time). If Assumption 1 is
satisfied, then it holds

$$t^*_\varepsilon \le \frac{\alpha}{2}\log\Big(\frac{2\alpha}{\varepsilon}\Big), \tag{5}$$

where $\alpha$ is defined in (4).

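Proof. From the definition we immediately deduce that

$$t^*_\varepsilon = \inf\Big\{t : \alpha\sum_{i=0}^{N-1}\xi_i^{2t} - \sum_{i=0}^{N-1}\frac{\xi_i^{2t}}{1 - \xi_i} \le \varepsilon\sum_{i=0}^{N-1}\frac{1}{1 - \xi_i}\Big\}.$$

By taking an upper bound on the second term of the left-hand side of the
inequality, we have

$$t^*_\varepsilon \le \inf\Big\{t : \alpha\sum_{i=0}^{N-1}\xi_i^{2t} \le \varepsilon\sum_{i=0}^{N-1}\frac{1}{1 - \xi_i}\Big\}.$$

Since Assumption 1 implies $|\xi_i| \le 1 - \tau\gamma$ and $\frac{1}{1 - \xi_i} \ge \frac{1}{2 - \tau\gamma} > \frac{1}{2}$, we have

$$t^*_\varepsilon \le \inf\Big\{t : \alpha(1 - \tau\gamma)^{2t} \le \frac{\varepsilon}{2}\Big\}.$$

By solving for $t$ in the above inequality we get

$$t^*_\varepsilon \le \frac{\log\left(\frac{2\alpha}{\varepsilon}\right)}{2\log\frac{1}{1 - \tau\gamma}}$$

and then the result follows, since $\log\frac{1}{1 - \tau\gamma} \ge \tau\gamma = \frac{1}{\alpha}$. □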
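To get a feeling for how conservative the bound is, one can compare it with the stopping time obtained by scanning $H_t$ directly. The sketch below does this on a cycle graph; the parameter values (in particular the step size `tau = 0.4`) and the helper name `mse` are assumptions made for the example.

```python
import math

def mse(t, xis, tau, sigma, nu):
    """H_t computed from the eigenvalues xi_i of Q (closed form above)."""
    return sum(nu**2 * xi**(2 * t) + tau * sigma**2 * (1 - xi**(2 * t)) / (1 - xi)
               for xi in xis) / len(xis)

N, sigma, nu, eps, tau = 160, 1.0, 20.0, 0.01, 0.4   # tau is hypothetical
gamma = sigma**2 / nu**2
alpha = 1.0 / (tau * gamma)
xis = [1 - tau * gamma - tau * (2 - 2 * math.cos(2 * math.pi * k / N))
       for k in range(N)]

H_inf = tau * sigma**2 / N * sum(1 / (1 - xi) for xi in xis)
bound = alpha / 2 * math.log(2 * alpha / eps)        # right-hand side of (5)

# Scan for the first t with H_t <= (1 + eps) * H_inf.
t_star = 0
while mse(t_star, xis, tau, sigma, nu) > (1 + eps) * H_inf:
    t_star += 1
```

With these values the scanned `t_star` is well below `bound`, which is consistent with the bound ignoring all spectral information other than $|\xi_i| \le 1 - \tau\gamma$.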



5 Simulations and comparison with [1]

We have simulated algorithm (1) and numerically evaluated the related
performance metrics, assuming the graph to be a cycle. Figure 1 compares the
simulated and expected performance of the algorithm. Notice that, although
the expected error $H_t$ is monotonic, single realizations need not be
monotonic, and indeed some of them show a minimum. The figure also shows the
actual near-optimal time $t^*_\varepsilon$, in comparison with its estimate obtained in
Proposition 4. We notice that the estimate is significantly larger than the true value:
this looseness is not surprising, as our bound does not exploit any information
about the topology of the sensing and communication graph, which is likely to
have a role. Hence, future research may improve upon our bounds by a careful
use of information about the spectrum of $Q$, i.e., on the graph.

Figure 1: Mean square error of algorithm (1) on a cycle graph with $N = 160$,
$\nu = 20$, $\sigma = 1$, $\varepsilon = 0.01$. The plot shows the expected error $H_t$ together with
single realizations $\frac{1}{N}\|\mathbf{x}[t] - \mathbf{x}\|_2^2$.

The second goal of our simulations is to compare algorithm (1) with the
analogous algorithm defined in [1, Eq. (1)], based on minimizing the least-squares cost $\Psi(\mathbf{x}) = \|A\mathbf{x} - \mathbf{b}\|_2^2$.
Hence, Figure 2 plots for both algorithms the mean square error, together with
the mean square error of a few single realizations. We can see that the
performance of the two algorithms is roughly similar (in expectation) until the
algorithm in [1] reaches a time at which its mean square error is minimal. From that
time on, the behavior of the two algorithms becomes different, as algorithm [1,
Eq. (1)] accumulates an increasingly larger mean square error, whereas the error
of algorithm (1) decreases further. We leave to future research a more detailed
comparison of the two algorithms, which should include a discussion on the
behavior of single realizations, as opposed to the average performance which has
been studied so far.
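The qualitative difference between the two error curves can be reproduced from the spectrum alone. The sketch below evaluates the expected error of the regularized iteration and of its unregularized counterpart on a cycle graph. The expression used for the unregularized case is derived here by repeating the error decomposition of Section 4.2 with $\gamma = 0$ and the same initial condition; it is an assumption of this example and may differ from the exact setup of [1]. The step size `tau = 0.4` and the horizon `T` are likewise hypothetical.

```python
import math

N, sigma, nu, tau = 160, 1.0, 20.0, 0.4      # tau: hypothetical step size
gamma = sigma**2 / nu**2
lambdas = [2 - 2 * math.cos(2 * math.pi * k / N) for k in range(N)]

def mse_regularized(t):
    """H_t of the iteration with the prior term (gamma > 0)."""
    total = 0.0
    for lam in lambdas:
        xi = 1 - tau * gamma - tau * lam
        total += nu**2 * xi**(2 * t) + tau * sigma**2 * (1 - xi**(2 * t)) / (1 - xi)
    return total / N

def mse_unregularized(t):
    """Same error metric for the gamma = 0 iteration started at x0.

    Derived here by repeating the error decomposition with gamma = 0;
    this is an assumption and may differ from the exact setup of [1].
    """
    total = 0.0
    for lam in lambdas:
        xi = 1 - tau * lam
        s = float(t) if lam < 1e-12 else (1 - xi**t) / (1 - xi)
        total += nu**2 * xi**(2 * t) + (tau * sigma)**2 * lam * s**2
    return total / N

T = 3000
reg = [mse_regularized(t) for t in range(T + 1)]
unreg = [mse_unregularized(t) for t in range(T + 1)]
t_min = min(range(T + 1), key=unreg.__getitem__)
```

Under these assumptions, `unreg` attains its minimum at an intermediate time `t_min` and grows afterwards, whereas `reg` keeps decreasing and ends below the best value reached by `unreg`.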

Figure 2: Mean square error of algorithms (1) and [1, Eq. (1)] on a cycle graph
with $N = 160$, $\nu = 20$, and $\sigma = 1$.

6 Conclusion

In this paper, we have studied a distributed algorithm to solve the relative
localization problem in sensor networks. Compared to algorithms available in the
literature, the proposed algorithm has an improved performance for large times:
moreover, the algorithm is guaranteed to reach (on average) an $\varepsilon$-approximation
of the optimal solution within a time which only grows logarithmically in $1/\varepsilon$ and
does not depend on either the topology of the sensor network or the number
of sensors. We interpret this feature as an inherent limitation on the benefit
of cooperation. Future research should put our results in a broader context,
investigating the fundamental issue of quantifying the benefit of cooperation (if
any), depending on the "cooperation task" which is assigned to the agents, as
well as on the available communication and the measurement models: a recent
example of work in this direction is [14].